
feat: add problog rule inference to build as code check #260

Draft
wants to merge 29 commits into base: staging

Conversation

@sophie-bates (Contributor) commented May 30, 2023

Priority (v1)

  • Update pyproject.toml with ProbLog dependency.
  • Refactor build as code run_check method into smaller subchecks.
  • Use ProbLog predicates so that the Prolog string can access the subcheck results.
  • Write ProbLog rules to evaluate the check's overall confidence score (a minimal sketch follows this list).
  • Update BuildAsCodeTable to receive the information.
  • Compare overall confidence score with threshold to make a decision about check passing.
  • Update check_result to store confidence score.
  • Add intermediate ProbLog queries that allow a decision to be made about which deployment method to store data from.
  • Store results from each ci_service and compare confidence_scores to produce the final result. Currently just using the first CI service.
  • Handle ProbLog errors properly.
  • Move subchecks to be external to BAC check.
  • Add additional subchecks.
  • Fetch project name from setup.py file. The timestamp check requires the project name. This is straightforward to fetch for projects where it's specified in pyproject.toml, but less so for those that use setup.py. I developed an initial implementation for setup.py but later removed it because I realised it was only applicable in very specific scenarios. The other option was to execute setup.py to fetch the name, but for obvious reasons we shouldn't do that.
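
To make the items above concrete, here is a minimal, self-contained sketch of how subcheck results could drive a ProbLog program. The predicate names, probabilities, rules, and threshold are illustrative assumptions, not the check's actual implementation:

from problog import get_evaluatable
from problog.program import PrologString

# Hypothetical subcheck evidence; in the real check these probabilities would
# come from the refactored subchecks rather than being hard-coded.
PROGRAM = PrologString(
    """
    0.9::deploy_action.
    0.7::deploy_command.
    0.8::workflow_trigger_release.

    % Illustrative rules combining subcheck evidence into an overall score.
    build_as_code_check :- deploy_action, workflow_trigger_release.
    build_as_code_check :- deploy_command, workflow_trigger_release.

    query(build_as_code_check).
    """
)

# Evaluating the program returns a mapping from each queried term to its probability.
results = get_evaluatable().create_from(PROGRAM).evaluate()
confidence_score = max(float(value) for value in results.values())

CONFIDENCE_SCORE_THRESHOLD = 0.5  # assumed threshold, for illustration only
print(f"Confidence score: {confidence_score:.2f}")
print("Check passes." if confidence_score >= CONFIDENCE_SCORE_THRESHOLD else "Check fails.")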

Subchecks

  • Timestamp of PyPI package vs. pipeline run - just need to finish the github_actions function that fetches the 5 most recent runs of a workflow file (a rough sketch follows below).
  • Workflow trigger
  • Test deployment
  • Secrets

Note: for many of the sub-checks we've had to check separately for the deploy action method vs. the deploy command, as there can be multiple methods used in a workflow. We need to figure out a more streamlined way to handle this.
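
As a rough illustration of the timestamp sub-check, the sketch below compares the latest PyPI upload time against the five most recent GitHub Actions runs of a workflow file. The package, repository, and workflow names are placeholders, and the one-hour window is an assumption:

from datetime import datetime, timedelta

import requests

# Placeholder values, for illustration only.
PACKAGE = "example-package"
OWNER_REPO = "example-org/example-repo"
WORKFLOW_FILE = "release.yaml"


def to_datetime(value: str) -> datetime:
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# Latest upload time of the package on PyPI.
pypi_data = requests.get(f"https://pypi.org/pypi/{PACKAGE}/json", timeout=10).json()
latest_version = pypi_data["info"]["version"]
upload_time = to_datetime(pypi_data["releases"][latest_version][0]["upload_time_iso_8601"])

# The five most recent runs of the workflow file on GitHub Actions.
runs_data = requests.get(
    f"https://api.github.com/repos/{OWNER_REPO}/actions/workflows/{WORKFLOW_FILE}/runs",
    params={"per_page": 5},
    timeout=10,
).json()

# The sub-check would pass if any recent run finished close to the PyPI upload time.
for run in runs_data.get("workflow_runs", []):
    if abs(to_datetime(run["updated_at"]) - upload_time) < timedelta(hours=1):
        print(f"Workflow run {run['id']} matches the PyPI release timestamp.")
        break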

Later

  • Update expected outcomes for integration tests.
  • Address the assumption that a project must have a deployment method found - this limited the inference. However, the challenge is that most sub-tasks rely on the existence of a particular workflow or workflow step to check. Perhaps we could store information per workflow and build up evidence in this way.
  • We added tox -e release as a deploy command; however, release is just a target pointing to a section in a tox.ini file, so we may need to investigate parsing these files to verify that the target contains deploy commands (a rough sketch follows this list).
  • Most of the functionality of Poetry overlaps with pip anyway, and there's little differentiation between the two in terms of project setup - i.e. Poetry projects can still use the same external GHAs etc. to deploy. There could be potential to have a generic Python class and use the inference to decide which one is used, but not until later in the check.
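
For the tox -e release point above, one possible starting point is to parse tox.ini with the standard-library configparser and look for known deploy commands in the release target. The keyword list below is an assumption:

import configparser

# Commands that would suggest a tox target actually performs a deployment (assumed list).
DEPLOY_KEYWORDS = ("twine upload", "poetry publish", "flit publish")

# Disable interpolation so values containing % or {posargs} don't raise errors.
parser = configparser.ConfigParser(interpolation=None)
parser.read("tox.ini")

# `tox -e release` points at the [testenv:release] section of tox.ini.
SECTION = "testenv:release"
if parser.has_section(SECTION):
    commands = parser.get(SECTION, "commands", fallback="")
    if any(keyword in commands for keyword in DEPLOY_KEYWORDS):
        print("The release target contains a deploy command.")
    else:
        print("No deploy command found in the release target.")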

Notes:

  • Branch multiple-deploy-commands-collected-test stores some changes from the proof of concept for performing inference of multiple deployment commands (that was used in the Micronaut case study).

@oracle-contributor-agreement bot added the OCA Verified label (All contributors have signed the Oracle Contributor Agreement) on May 30, 2023
@sophie-bates marked this pull request as draft May 30, 2023 01:24
print(key, value)
if str(key) == "build_as_code_check":
    confidence_score = float(value)
results = vars(build_as_code_subchecks.build_as_code_subcheck_results)
Member:

It's okay to use this built-in function for now. But we should keep in mind that this function raises a TypeError when the __dict__ attribute cannot be found on the target object.
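
As a quick illustration of this point (not project code), vars() only works on objects that expose __dict__:

class WithDict:
    def __init__(self) -> None:
        self.score = 0.8


class NoDict:
    __slots__ = ("score",)  # instances of this class have no __dict__


print(vars(WithDict()))  # {'score': 0.8}
try:
    vars(NoDict())
except TypeError as error:
    print(f"vars() failed: {error}")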

Contributor Author:

Thanks for your comment, Nhan. I'm actually about to push a change that means I'm no longer using this anyway, so hopefully it's all good.

self.check_results: dict = {} # Update this with each check.
self.ci_info = ci_info
self.ci_service = ci_info["service"]
self.failed_check = 0.0
@tromai (Member) commented Jun 7, 2023:

I am a bit confused by this attribute failed_check. Could you document what this attribute is used for and what value it contains?
From a first look, I was expecting a boolean type instead of a float. Is this attribute the value that we should return if a sub-check fails?

Contributor Author:

Yes, it's the value that should be returned if the sub check fails.

check_certainty = 1.0
# If this check has already been run on this repo, return certainty.

justification: list[str | dict[str, str]] = ["The CI workflow files for this CI service are parsed."]
Member:

We could define the justification inside the if branch so that we don't need to do it when self.ci_info["bash_commands"] is not available.
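
A minimal sketch of that suggestion (the helper name and dict shape are assumptions, not project code):

def build_justification(ci_info: dict) -> list[str | dict[str, str]]:
    """Only create the justification when bash commands were actually extracted."""
    if ci_info.get("bash_commands"):
        return ["The CI workflow files for this CI service are parsed."]
    return []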

…ermine which results to store

Signed-off-by: sophie-bates <[email protected]>
# TODO: Investigate using proofs

# Check whether the confidence score is greater than the minimum threshold for this check.
if confidence_score >= self.confidence_score_threshold:
Member:

At this line, if a CI service has a confidence_score larger than the threshold, but smaller than other CI services, wouldn't the check return the result for the smaller confidence_score?

Contributor Author:

True. I can update this implementation to store all of the required information (i.e. deploy commands) for each ci_service and then populate the BuildAsCodeTable and AnalyzeContext with whichever ci_service has the highest confidence_score?

Member:

Sure. However, I don't understand why AnalyzeContext needs to be updated.

@sophie-bates (Contributor Author) commented Jun 8, 2023:

I just meant where the ci_info information is updated, i.e. here.

Member:

Oh, that's the inferred provenance representation. We don't update ci_info itself.
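
Sketching the idea discussed in this thread, one could collect results per CI service and keep whichever one has the highest confidence score. The dataclass and field names below are illustrative assumptions, not the project's actual types:

from dataclasses import dataclass, field


@dataclass
class CIServiceResult:
    """Illustrative container for per-CI-service inference results."""
    service_name: str
    confidence_score: float
    deploy_commands: list[str] = field(default_factory=list)


def select_best_result(
    results: list[CIServiceResult], threshold: float
) -> CIServiceResult | None:
    """Return the result with the highest confidence score at or above the threshold."""
    passing = [result for result in results if result.confidence_score >= threshold]
    return max(passing, key=lambda result: result.confidence_score, default=None)


# Example: the github_actions result wins even though circle_ci also passes.
best = select_best_result(
    [
        CIServiceResult("circle_ci", 0.6, ["./deploy.sh"]),
        CIServiceResult("github_actions", 0.9, ["twine upload dist/*"]),
    ],
    threshold=0.5,
)
print(best)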
